Transforming a Constituency Treebank into a Dependency Treebank
Abstract
We present a heuristic technique for converting a constituency treebank into a dependency treebank. In particular, we comment on our experience in converting the Spanish treebank Cast3LB. We extract a context-free grammar from the treebank, automatically identify the head in each rule, and use this information for constructing the dependency tree. Our heuristics have 99% precision and 80% recall in identifying the head in the rules, which gives 92% accuracy in identifying dependencies between words.
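The general idea of head-driven conversion described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' heuristics: the `HEAD_PRIORITY` table, the category labels (`S`, `NP`, `VP`, ...), the rightmost-child fallback, and the English example sentence are all assumptions made for the example and do not reflect the Cast3LB tagset. Given a head child for each rule, every non-head child's lexical head is attached to the lexical head of the head child.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)
    word: Optional[str] = None  # set only on leaves
    index: int = -1             # 1-based word position, set only on leaves


# Hypothetical head table: for each parent label, candidate head labels
# in priority order. A real conversion would derive this from the treebank.
HEAD_PRIORITY: Dict[str, List[str]] = {
    "S": ["VP", "NP"],
    "VP": ["V", "VP"],
    "NP": ["N", "NP"],
    "PP": ["P"],
}


def head_child(node: Node) -> Node:
    """Pick the head child of a constituent using the priority table."""
    for label in HEAD_PRIORITY.get(node.label, []):
        for child in node.children:
            if child.label == label:
                return child
    # Fallback (an assumption of this sketch): take the rightmost child.
    return node.children[-1]


def lexical_head(node: Node, deps: Dict[int, int]) -> int:
    """Return the word index heading `node`, recording dependencies in `deps`."""
    if node.word is not None:
        return node.index
    head = head_child(node)
    head_index = lexical_head(head, deps)
    for child in node.children:
        if child is not head:
            # The lexical head of each non-head child depends on the head word.
            deps[lexical_head(child, deps)] = head_index
    return head_index


def to_dependencies(root: Node) -> Dict[int, int]:
    """Map each word index to the index of its head (0 = artificial root)."""
    deps: Dict[int, int] = {}
    deps[lexical_head(root, deps)] = 0
    return deps


def leaf(label: str, word: str, index: int) -> Node:
    return Node(label=label, word=word, index=index)


# Toy tree for "the dog chased a cat" (illustrative only).
tree = Node("S", [
    Node("NP", [leaf("D", "the", 1), leaf("N", "dog", 2)]),
    Node("VP", [
        leaf("V", "chased", 3),
        Node("NP", [leaf("D", "a", 4), leaf("N", "cat", 5)]),
    ]),
])

deps = to_dependencies(tree)
print(deps)
```

In this toy tree the verb heads the sentence, each noun heads its noun phrase, and each determiner attaches to its noun. An error in choosing the head of a single rule affects every tree that uses that rule, so the accuracy of the resulting dependencies depends directly on the quality of head identification.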
Related resources
Automatic Conversion of a Persian Dependency Treebank into a Constituency Treebank
There are two major types of treebanks: dependency-based and constituency-based. Both of them have applications in natural language processing and computational linguistics. Several dependency treebanks have been developed for Persian. However, there is no available big size constituency treebank for this language. In this paper, we aim to propose an algorithm for automatic conversion of a depe...
Statistical French Dependency Parsing: Treebank Conversion and First Results
We first describe the automatic conversion of the French Treebank (Abeillé and Barrier, 2004), a constituency treebank, into typed projective dependency trees. In order to evaluate the overall quality of the resulting dependency treebank, and to quantify the cases where the projectivity constraint leads to wrong dependencies, we compare a subset of the converted treebank to manually validated d...
Converting SynTagRus Dependency Treebank into Penn Treebank Style
This paper presents the conversion of SynTagRus dependency structures into Penn Treebank style phrase structures, whose resulting data will be used to train a statistical constituency parser for Russian and create a large-scale constituency-parsed corpus. The implemented conversion includes various innovative features in order to create phrase structure trees that are closest to Penn Treebank s...
An Empirical Evaluation of Automatic Conversion from Constituency to Dependency in Hungarian
In this paper, we investigate the differences between Hungarian sentence parses based on automatically converted and manually annotated dependency trees. We also train constituency parsers on the manually annotated constituency treebank and then convert their output to dependency trees. We argue for the importance of training on gold standard corpora, and we also demonstrate that although the r...
Evalita’09 Parsing Task: constituency parsers and the Penn format for Italian
The aim of Evalita Parsing Task is at defining and extending the state of the art for parsing Italian by encouraging the application of existing models and approaches. Therefore, as in the first edition, the Task includes two tracks, i.e. dependency and constituency. This second track is based on a development set in a format, which is an adaptation for Italian of the Penn Treebank format, and ...
Journal title:
- Procesamiento del Lenguaje Natural
Volume: 35
Publication date: 2005